Parsing Transcribed Spoken Language
Abstract
This paper investigates some of the challenges that arise when parsing transcribed spoken language, as opposed to written language. In particular, it focuses on identifying clauses as the object of syntactic analysis and on parsing data containing production errors.
Similar resources
Phonetically aided syntactic parsing of spoken language
The paper presents a technique for parsing a speech utterance from its phonetic representation. The technique differs from conventional spoken language parsing techniques, in which a speech utterance is first transcribed at the word level and a syntactic structure is then produced from the transcribed words. In a word-level parsing approach, an error caused by a speech recognizer propagates through t...
Combining pattern matching and shallow parsing techniques for detecting and correcting spoken language extragrammaticalities
In the context of spoken language understanding systems, we investigated the possibility of normalizing oral utterances. Our approach is based on the integration of different knowledge sources for detecting and correcting spoken language extragrammaticalities. The processing follows two main steps: in the first, lexical processing allows the normalization of oral words and oth...
Parsing Arabic Dialects
The Arabic language is a collection of spoken dialects with important phonological, morphological, lexical, and syntactic differences, along with a standard written language, Modern Standard Arabic (MSA). Since the spoken dialects are not officially written, it is very costly to obtain adequate corpora to use for training dialect NLP tools such as parsers. In this paper, we address the problem ...
Clause Boundary Detection in Transcribed Spoken Language
We argue that finite clauses should be regarded as the basic unit in syntactic analysis of spoken language, and describe a method that automatically detects clause boundaries by classifying coordinating conjunctions in spoken language discourse as belonging to either the syntactic level or the discourse level of analysis. The method exploits the special role that coordinating conjunctions play ...
HCP with PSMA: A Robust Spoken Language Parser
"Spoken language" is a field of natural language processing that deals with transcribed speech utterances. Processing spoken language is much more complex than processing standard, grammatically correct natural language, and requires special treatment of typical speech phenomena called "disfluencies", such as corrections, interjections, and repetitions of words or ph...
Publication date: 2006